Example-based Rescoring of Statistical Machine Translation Output

Authors

  • Michael Paul
  • Eiichiro Sumita
  • Seiichi Yamamoto
Abstract

Conventional statistical machine translation (SMT) approaches may fail to find a good translation because of problems in their statistical models (caused by data sparseness during the estimation of the model parameters) as well as search errors during decoding. This paper presents an example-based rescoring method that validates SMT translation candidates and judges whether the selected decoder output is good or not. Given such a validation filter, defective translations can be rejected. The experiments show a drastic improvement in overall system performance compared to translation selection methods based on statistical scores only.

Similar resources

Lattice rescoring methods for statistical machine translation

Modern statistical machine translation (SMT) systems include multiple interrelated components, statistical models, and processes. Translation is often factored as a cascaded series of modules such that the output of one module serves as the input to the next; this is the SMT pipeline. Simplifying assumptions, limited training data, and pruning during search mean that the hypothesis produced by ...

I2R's machine translation system for IWSLT 2009

In this paper, we describe the system and approach used by the Institute for Infocomm Research (I2R) for the IWSLT 2009 spoken language translation evaluation campaign. Two kinds of machine translation systems are applied, namely, phrase-based machine translation system and syntax-based machine translation system. To test syntax-based machine translation system on spoken language translation, va...

The University of Washington machine translation system for IWSLT 2006

This paper describes the University of Washington’s submission to the IWSLT 2006 evaluation campaign. We present a multi-pass statistical phrase-based machine translation system for the Italian-English open-data track. The focus of our work was on the use of heterogeneous data sources for training translation and language models, the use of several novel rescoring features in the second pass, a...

Towards Improving English-Latvian Translation: A System Comparison and a New Rescoring Feature

This paper presents a comparative study of two alternative approaches to statistical machine translation (SMT) and their application to a task of English-to-Latvian translation. Furthermore, a novel feature intending to reflect the relatively free word order scheme of the Latvian language is proposed and successfully applied on the n-best list rescoring step. Moving beyond classical automatic s...

Publication date: 2004